Persistent Homology for the Evaluation of Dimensionality Reduction Schemes

نویسندگان

  • Bastian Rieck
  • Heike Leitte
چکیده

High-dimensional data sets are a prevalent occurrence in many application domains. This data is commonly visualized using dimensionality reduction (DR) methods. DR methods provide e.g. a two-dimensional embedding of the abstract data that retains relevant high-dimensional characteristics such as local distances between data points. Since the amount of DR algorithms from which users may choose is steadily increasing, assessing their quality becomes more and more important. We present a novel technique to quantify and compare the quality of DR algorithms that is based on persistent homology. An inherent beneficial property of persistent homology is its robustness against noise which makes it well suited for real world data. Our pipeline informs about the best DR technique for a given data set and chosen metric (e.g. preservation of local distances), and provides information about the local quality of an embedding that help users understand the shortcomings of the selected DR method. The utility of our method is demonstrated using application data from multiple domains and a variety of commonly used DR methods.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Persistent homology for low-complexity models

We show that recent results on randomized dimension reduction schemes that exploit structural properties of data can be applied in the context of persistent homology. In the spirit of compressed sensing, the dimension reduction is determined by the Gaussian width of a structure associated to the data set, rather than its size, and such a reduction can be computed efficiently. We further relate ...

متن کامل

Impact of linear dimensionality reduction methods on the performance of anomaly detection algorithms in hyperspectral images

Anomaly Detection (AD) has recently become an important application of hyperspectral images analysis. The goal of these algorithms is to find the objects in the image scene which are anomalous in comparison to their surrounding background. One way to improve the performance and runtime of these algorithms is to use Dimensionality Reduction (DR) techniques. This paper evaluates the effect of thr...

متن کامل

2D Dimensionality Reduction Methods without Loss

In this paper, several two-dimensional extensions of principal component analysis (PCA) and linear discriminant analysis (LDA) techniques has been applied in a lossless dimensionality reduction framework, for face recognition application. In this framework, the benefits of dimensionality reduction were used to improve the performance of its predictive model, which was a support vector machine (...

متن کامل

Comparing Dimensionality Reduction Methods Using Data Descriptor Landscapes

Dimensionality reduction (DR) methods are commonly used in data science to turn high-dimensional data into 2D representations. Since data sets contain different structural features that need to be preserved by this process, there is a multitude of DR methods, each geared towards preserving a separate aspect. This makes choosing a suitable algorithm for a given data set a challenging task. In th...

متن کامل

Persistent Homology Analysis of RNA

Topological data analysis hasbeen recentlyused to extractmeaningful information frombiomolecules. Here we introduce the application of persistent homology, a topological data analysis tool, for computing persistent features (loops) of the RNA folding space. The scaffold of the RNA folding space is a complex graph from which the global features are extracted by completing the graph to a simplici...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:
  • Comput. Graph. Forum

دوره 34  شماره 

صفحات  -

تاریخ انتشار 2015